The acute respiratory distress syndrome (ARDS) is a severe inflammatory disease 
manifested as a result of pulmonary and systemic responses to several insults. It is now 
well accepted that genetic variation influences these responses. However, little is known 
about the genes that are responsible for patient susceptibility and outcome of ARDS. 
Methodological flaws are still abundant among genetic association studies with ARDS 
and here, we aimed to highlight the quality criteria where the standards have not been 
reached, to expose the associated genes to facilitate replication attempts, and to provide 
quick-reference guidance for future studies. We conducted a PubMed search from January 
2008 to September 2012 for original articles. Studies were considered if a statistically 
significant association was declared with either susceptibility or outcomes of all-cause 
ARDS. Fourteen criteria were used for evaluation and results were compared to those from 
a previous quality assessment report. Significant improvements affecting study design 
and statistical analysis were detected. However, major issues such as adjustments for the 
underlying population stratification and replication studies remain poorly addressed. 

INTRODUCTION 

Acute lung injury (ALI) and its severe form, the acute respiratory 
distress syndrome (ARDS), are characterized by acute diffuse lung 
inflammation and non-cardiogenic pulmonary edema resulting 
from increased capillary-alveolar permeability. While ALI and 
ARDS terms continue to be used in the medical literature, their 
definition criteria were recently revised, although a consensus 
has not been reached (Ranieri et al., 2012; Villar et al., 2013). 
New definitions support the categorization of ARDS based on 
the hypoxemia severity under mechanical ventilation, as well as 
on other physiological and clinical parameters, discouraging the 
use of ALI as one of the categories. Hereafter, we will refer to 
this constellation of syndromes using the term ARDS, irrespective 
of the classification used by the studies reviewed (Bernard et al., 
1994). ARDS shows profound incidence variability across coun- 
tries (Rubenfeld et al., 2005; Villar et al., 2013), and it is unknown 
whether differences also exist among ethnic groups (Martin et al., 
2003; Erickson et al., 2009; Linko et al., 2009; Villar et al., 2011) 
and the extent to which demographic, cultural, economical, and 
health system particularities might underlie such differences. 

Predisposing genetic factors can interact with the environ- 
ment to determine the diversity of clinical manifestations, the 
response to treatment and outcomes among ARDS patients 
(Cobb and O'Keefe, 2004; Villar et al., 2004; Rahim et al., 2008). 
Exposing those genetic factors might reveal therapeutic targets 
and a foundation to predict ARDS susceptibility and outcomes. 
Association studies have been widely used for detecting common, 
low-penetrant, genetic variants that are suggested to contribute to 
the genetic architecture of complex diseases (Khoury and Yang, 
1998), including ARDS (Flores et al., 2008). For ARDS, these 
studies have mostly focused on particular biological candidates 
and, only recently, have explored the entire genome (Christie 
et al., 2012). We have previously assessed the quality of statistically 
significant associations of genetic variants with ARDS from 1996 
to 2008 based on major recommendations that support study 
robustness (Flores et al., 2008). We hypothesized that, despite 
this previous evaluation and the availability of well-known standard 
guidelines (Janssens et al., 2011), many association studies in 
this field continue to be performed without awareness of minimal 
standards and that methodological flaws are still abundant. Here, 
we aimed to identify those quality criteria where the standards 
have not been reached, to expose the associated candidate genes 
to facilitate replication studies, and to create a guidance frame- 
work for ongoing and future studies. For that, we have critically 
assessed statistically significant candidate-gene associations with 
susceptibility or outcome of all-cause ARDS from 2008 to 2012 
using 14 major quality control criteria, and compared the updated 
results with our previous evaluation (Flores et al., 2008). 

MATERIALS AND METHODS 
LITERATURE SEARCH 

We have previously assessed the quality of genetic association 
studies supporting susceptibility and/or outcome in adult ARDS 
patients for the period 1996-2008 (Flores et al., 2008). We have 
now conducted a PubMed search from January 2008 to September 
2012 by utilizing the same keyword combinations for querying 
("polymorphism" and "acute lung injury," "polymorphism" and 
"ARDS," and "polymorphism" and "acute respiratory distress syn- 
drome"). Because of the plausibility that a fraction of risk variants 
for ARDS susceptibility could be also risk factors for outcomes, 
both possibilities were jointly analyzed. The retrieved references 
were then manually reviewed. Excluding meta-analyses, articles 
reporting statistically significant associations in adults (p < 0.05) 
for any cause of ALI or ARDS, irrespective of the type of genetic 
variants associated, and published in English, were reviewed by 
three of the authors. We are aware that a number of such reported 
associations might be false positives. However, this threshold for 
significance is preferable over a more conservative strategy at this 
stage of field development (Thomas and Clayton, 2004). Finally, 
we considered the gene as the unit of replication (Neale and Sham, 
2004). 

STUDY ASSESSMENT 

For simplicity, we focused on the 14 most relevant criteria, pre- 
viously utilized by us in Flores et al. (2008), modifying the 
exhaustive list provided by Chanock et al. (2007), scoring each 
item as present or absent. Chi-squared tests were performed in 
SPSS (SPSS Inc., Chicago, IL). 
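
As an illustration of the kind of comparison described above, the following minimal sketch computes a Pearson chi-squared test on a 2x2 table (criterion met or not met, in two periods). It assumes no continuity correction, which may differ from the SPSS settings actually used, and the counts are hypothetical rather than taken from the reviewed studies.

```python
import math


def chi2_2x2(met_a, total_a, met_b, total_b):
    """Pearson chi-squared test (1 df, no continuity correction) comparing
    the proportion of studies meeting a criterion in two periods."""
    table = [
        [met_a, total_a - met_a],
        [met_b, total_b - met_b],
    ]
    n = total_a + total_b
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 5 of 20 studies met the criterion in the first
# period and 15 of 20 in the second.
chi2, p_value = chi2_2x2(met_a=5, total_a=20, met_b=15, total_b=20)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With small expected counts, an exact test would usually be preferred over this approximation.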

GENE COVERAGE IN GENOTYPING ARRAYS 

Gene coverage was calculated with the Tagger tool (Barrett et al., 
2005) for SNPs with minor allele frequency >5% in the gene 
region captured directly and indirectly by the genome-wide 
genotyping array utilized (with a multi-marker r² > 0.8). 
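
To make the coverage criterion concrete, the sketch below counts a common SNP as captured when it is either on the array or in strong linkage disequilibrium with an array SNP. It is a pairwise simplification and not the Tagger implementation, which also evaluates multi-marker tests. The 0.8 cutoff is read here as an r² threshold (an assumption), and all SNP names and haplotypes are hypothetical.

```python
def r_squared(hap_a, hap_b):
    """Pairwise LD r^2 between two biallelic SNPs from phased haplotypes
    coded 0/1, with the same haplotype order in both lists."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # monomorphic SNP: r^2 is undefined, treated as untagged
    d = p_ab - p_a * p_b
    return d * d / denom


def gene_coverage(haplotypes, array_snps, maf_min=0.05, r2_min=0.8):
    """Fraction of common SNPs in a gene region captured directly (on the
    array) or indirectly (r^2 above r2_min with at least one array SNP)."""
    common = []
    for name, hap in haplotypes.items():
        freq = sum(hap) / len(hap)
        if min(freq, 1 - freq) > maf_min:
            common.append(name)
    if not common:
        return 0.0
    typed = [name for name in array_snps if name in haplotypes]
    captured = 0
    for name in common:
        direct = name in array_snps
        indirect = any(
            r_squared(haplotypes[name], haplotypes[t]) > r2_min for t in typed
        )
        if direct or indirect:
            captured += 1
    return captured / len(common)


# Hypothetical phased haplotypes (8 chromosomes) for four SNPs in one gene.
toy_haplotypes = {
    "snp_a": [1, 1, 1, 1, 0, 0, 0, 0],
    "snp_b": [1, 1, 1, 1, 0, 0, 0, 0],  # perfect LD with snp_a
    "snp_c": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of snp_a
    "snp_d": [1, 0, 0, 0, 0, 0, 0, 0],  # weak LD with snp_a
}
on_array = {"snp_a"}
print(gene_coverage(toy_haplotypes, on_array))
```

With these toy inputs, two of the four common SNPs are captured, giving a coverage of 0.5.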

RESULTS 

The PubMed search on the period 2008-2012 allowed a closer 
review of 27 original articles reporting statistically significant 
association findings on 31 candidate genes with susceptibility 
and/or outcomes of all-cause ARDS (Table S1), and 
the first genome-wide association study (GWAS) for this 
syndrome (Christie et al., 2012). The latter was excluded 
from the evaluation as its quality control assessment dif- 
fers substantially from those applied to candidate-gene stud- 
ies. A complementary search querying for the syndrome name 
in the HuGeNet Navigator (Yu et al., 2008) gave over- 
lapping results, showing studies for additional genes albeit 
all reporting statistically non-significant findings. We, there- 
fore, continued the quality assessment based on the PubMed 
search. 

Seventeen studies (63%) provided statistically significant find- 
ings with a case-control design and ten (37%) with a cohort. 
These were based on a median sample size of 251 cases 
[interquartile range (P25-P75): 84-365] and 288 controls (P25-P75: 
190-724) in case-control studies, whereas for cohort studies 
the median sample size was 145 patients (P25-P75: 118-215). 



In this period, almost all studies (96%) appropriately described 
demographical and clinical data for cases and all had an adequate 
characterization of the control group (47.1% of them utilized 
healthy subjects or population-based controls and 52.9% opted 
to use at risk patients as controls). However, only 50% of the 
studies explored their power to detect statistically significant 
findings. 
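
A power exploration of the kind referred to above can be sketched as follows for a two-sided allelic test under a normal approximation. The sample sizes are the median case and control counts reported in this section; the allele frequencies are hypothetical, and dedicated power software would normally model the genetic effect in more detail.

```python
from statistics import NormalDist


def allelic_power(freq_cases, freq_controls, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided test comparing allele frequencies
    between cases and controls (normal approximation)."""
    alleles_cases = 2 * n_cases        # two alleles per diploid individual
    alleles_controls = 2 * n_controls
    se = (
        freq_cases * (1 - freq_cases) / alleles_cases
        + freq_controls * (1 - freq_controls) / alleles_controls
    ) ** 0.5
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)
    z_effect = abs(freq_cases - freq_controls) / se
    # Probability of exceeding the critical value in either tail.
    return normal.cdf(z_effect - z_alpha) + normal.cdf(-z_effect - z_alpha)


# Hypothetical risk-allele frequencies of 0.25 in cases and 0.20 in controls.
power = allelic_power(0.25, 0.20, n_cases=251, n_controls=288)
print(f"approximate power: {power:.2f}")
```

Under these assumed frequencies, the function can be re-run with larger sample sizes to see how quickly power improves.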

While roughly a third of studies (35%) focused on a single 
variant of the gene under study, the majority (65%) analyzed sev- 
eral polymorphisms attaining appropriate gene coverage of com- 
mon variation by means of linkage disequilibrium (LD)-based 
methods. In most cases (74%), the studies allowed the unambiguous 
identification of the genomic location of the associated variant(s) 
on public resources. Similarly, most studies declared that Hardy- 
Weinberg equilibrium expectations were assessed (93%), and 
that further genotyping error checks were implemented during 
the study (59%). Almost half of the studies (48%) stated that 
genotyping was performed blind to the disease status of samples. 
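
For reference, a Hardy-Weinberg equilibrium check on a biallelic variant can be sketched as a chi-squared goodness-of-fit test with one degree of freedom. This is a generic illustration with hypothetical genotype counts; the reviewed studies may have used exact tests or other software.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test (1 df) of genotype counts against
    Hardy-Weinberg proportions for a biallelic variant."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts in a control group.
chi2, p_value = hwe_chi2(n_aa=25, n_ab=50, n_bb=25)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

A marked deficit of heterozygotes, as in counts of 50, 0 and 50, yields a large statistic and may point to genotyping error or population substructure.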

Focusing on the statistical analyses, 65% of the studies that 
needed to control type-I error due to multiple hypothesis testing 
did so, and 89% included covariates in the regression analyses. 
The magnitude of effects was appropriately reported in terms of 
hazard ratios (HRs) or odds ratios (ORs) in almost all reviewed 
studies (96%) (Table S1). The adjustment for population stratification 
and replication in at least one independent study sample 
were declared in only 22 and 19% of the studies, respectively, 
two major issues that have not improved over the years (Flores 
et al., 2008) (Figure 1). Similarly, almost half of the studies (44%) 
pursued the functional significance of associated variants. 
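
The text above does not state which multiple-testing procedures the reviewed studies applied. As a hedged illustration, the sketch below implements two common choices, the Bonferroni correction and the Benjamini-Hochberg false discovery rate adjustment, on hypothetical p-values.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values (controls the family-wise error rate)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery
    rate), returned in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four variants tested in one gene.
raw = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(raw))
print(benjamini_hochberg(raw))
```

Bonferroni is the more conservative of the two, which matters when tested variants are correlated through LD.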

On a side-by-side comparison of the two periods reviewed 
to date (i.e., 1996-2008 reviewed by Flores et al., 2008 and this 
one from 2008 to 2012), significant improvements in the qual- 
ity of the published studies were observed in the most recent 
period (Figure 1) affecting study design, study reproducibility, 
and statistical analysis. These improvements were due to an 
increase of studies exploiting the available tools for LD explo- 
ration to efficiently select the genetic variants (from 24 to 67%, 
chi-squared p = 0.003); controlling type-I error by incorporat- 
ing multiple testing adjustments on the analyses (from 10 to 65%, 
chi-squared p = 0.0003); and accurately identifying the genomic 
location of the associated variant(s) (from 45 to 74%, chi-squared 
p = 0.033). 

DISCUSSION 

We have assessed the evidence obtained during 2008-2012 from 
ARDS candidate-gene association studies and compared them 
with our previous assessment to objectively evaluate the evolu- 
tion of the field, especially in light of the methodology applied 
in genetic susceptibility studies. In total, including the evidence 
accumulated before 2008 (Flores et al., 2008), 56 studies on 41 
candidate genes reported statistically significant associations with 
susceptibility or outcomes of all-cause ARDS (Figure 2). 

FIGURE 1 | Histogram comparing quality control scores of association 
studies in ARDS published from 1996 to 2008 (taken from Flores et al., 
2008) and from 2008 until present. Statistically significant improvements 
affected criteria relevant to study design (LD exploration), study 
reproducibility (polymorphism identification) and statistical analysis (multiple 
testing adjustments). *p-value < 0.05; **p-value < 0.001. 

We detected significant improvements affecting the exploitation 
of resources for LD exploration, the inclusion of multiple 
testing adjustments, and the way studies identified the associated 
variants by established recommendations. This was also extensible 
to sample sizes for case-control designs, as these have roughly 
doubled their median sample by group compared to studies published 
before 2008. Despite this improvement, replications in 
independent studies are needed to improve the association reli- 
ability. Worth noting, the diversity of samples has increased over 
the years, so that across all published studies a few have focused on 
African-Americans (6.6%), while the majority continues to use 
Europeans (66.7%), East Asians (15%), or multiethnic samples 
(11.7%). While all these improvements are encouraging, shortcomings 
persist in the adjustment for population stratification and in 
replication attempts, as these were conducted in 
less than a fifth of all reviewed reports. 

The identification of genuine gene associations with ARDS 
relies on conducting more replication studies, albeit without sac- 
rificing study robustness, as only a few associated genes have been 
replicated to date (Figure 2). Among those genes, ACE was asso- 
ciated several times and a meta-analysis was recently published 
(Matsuda et al., 2012). Although results should be taken with cau- 
tion because of power limitations, they revealed variable effects 
of an ACE polymorphism with ARDS mortality, present in East 
Asians but lacking in Europeans. This illustrates the growing evi- 
dence supporting that genetic risks may be population-specific, 
either because of gene-gene or gene-environment interactions or 
because of frequency effects (Need and Goldstein, 2009). Given 
that we are far from having a complete list of ARDS genes, and 
that an incomplete overlap of genetic risks between populations 
is expected, the study of samples of diverse ancestry should be 
encouraged in future studies. It must be noted that across all 
reviewed studies, genetic associations with ARDS susceptibility 
or outcomes with opposite effects in different ancestry groups 
were absent, although differences by the ARDS triggering insult 
have been detected (Christie et al., 2008). One major issue that is 
determinant of the robustness of association studies with unre- 
lated individuals is the assessment and adjustment of results 
for the underlying (sometimes cryptic) population stratification, 
which is usually based on data from independent genetic polymorphisms 
(Price et al., 2006). Still today, more than 80% of 
the published association studies in ARDS did not apply such an 
approach, even though a few dozen highly informative genetic variants 
(termed ancestry informative markers, AIMs) have demonstrated their 
utility in specific populations (Pino-Yanes et al., 2011; Galanter et al., 2012). As the 
studies that focus on particular genomic regions will continue to 
be relevant in the field (Chanock et al., 2007), population strati- 
fication effects should be minimized in future association studies, 
irrespective of the study population being assessed. Therefore, it 
becomes essential to develop efficient and straightforward meth- 
ods that: (1) could be applied to different populations and be 
universally used, and (2) could assist researchers to easily select 
a reduced set of AIMs to accurately assess ancestry maintain- 
ing affordable costs. Such tools would be useful to validate study 
robustness as well as to address the biological differences between 
populations, and whether these may trigger disparities in ARDS 
susceptibility or outcomes. It must be noted, however, that population 
stratification also introduces non-genetic effects that will 
not be addressed by these methods. It is expected that analyses 
of these effects and interactions will bring new opportunities and 
challenges in the field (Rotimi and Jorde, 2010). 
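
The effect of unadjusted stratification can be illustrated with a toy example. The sketch below uses a Mantel-Haenszel pooled odds ratio across ancestry strata, a simpler stand-in for the marker-based adjustments discussed above and not the method of any reviewed study. All counts are hypothetical: the allele is unrelated to disease within each stratum, yet the pooled, unadjusted table suggests an association.

```python
def odds_ratio(carrier_cases, noncarrier_cases, carrier_controls, noncarrier_controls):
    """Odds ratio for carrying the allele in cases versus controls."""
    return (carrier_cases * noncarrier_controls) / (noncarrier_cases * carrier_controls)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio pooled across strata. Each stratum is a
    tuple (carrier cases, non-carrier cases, carrier controls,
    non-carrier controls)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical ancestry strata: the allele is common in stratum 1, which
# contributes most cases, and rare in stratum 2, which contributes most controls.
strata = [
    (80, 20, 40, 10),  # stratum 1: odds ratio of 1 within the stratum
    (10, 40, 20, 80),  # stratum 2: odds ratio of 1 within the stratum
]
crude = odds_ratio(*[sum(column) for column in zip(*strata)])
adjusted = mantel_haenszel_or(strata)
print(f"crude OR = {crude:.2f}, stratum-adjusted OR = {adjusted:.2f}")
```

In practice ancestry is rarely known this cleanly, which is why it is usually inferred from independent markers such as AIMs.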

FIGURE 2 | Diagram showing the official gene symbols for the 41 
candidate genes associated with ARDS susceptibility and outcomes, 
depicting both chromosome locations and the number of study samples 
with statistically significant associations. For each chromosome, lower 
arrowheads indicate the location of genes with a single sample association, 
and upper arrowheads indicate the location of genes with statistically 
significant association findings in at least two study samples. Arrowheads 
with asterisk indicate more than one gene in that region. Dots denote that 
the gene was replicated in the only GWAS of ARDS published to date. 
Underlined gene names indicate that the product has been suggested as a 
biomarker for ARDS or its progression in at least one study. ACE, angiotensin 
I converting enzyme; ANGPT2, angiopoietin 2; CXCL2, chemokine (C-X-C 
motif) ligand 2; DARC, duffy blood group, chemokine receptor; DIO2, 
deiodinase, iodothyronine, type II; EGF, epidermal growth factor; F5, 
coagulation factor V (proaccelerin, labile factor); FAS, TNF receptor 
superfamily, member 6; FTL, ferritin, light polypeptide; GP5, glycoprotein V 
(platelet); HMOX1, heme oxygenase 1; HMOX2, heme oxygenase 2; IL6, 
interleukin 6; IL10, interleukin 10; IL18, interleukin 18; IL32, interleukin 32; 
IRAK3, interleukin-1 receptor-associated kinase 3; LTA, lymphotoxin alpha; 
MBL2, mannose-binding lectin 2; MIF, macrophage migration inhibitory 
factor; MYLK, myosin light chain kinase; NAMPT, nicotinamide 
phosphoribosyltransferase; NFE2L2, nuclear factor (erythroid-derived 2)-like 
2; NFKB1, nuclear factor of kappa light polypeptide gene enhancer in B-cells 
1; NFKBIA, nuclear factor of kappa light polypeptide gene enhancer in B-cells 
inhibitor alpha; NQO1, NAD(P)H dehydrogenase, quinone 1; PI3, peptidase 
inhibitor 3; PLAU, plasminogen activator, urokinase; PPARGC1A, peroxisome 
proliferator-activated receptor gamma, coactivator 1 alpha; SFTPA1, surfactant 
protein A1; SERPINE1, serpin peptidase inhibitor, clade E (nexin, plasminogen 
activator inhibitor type 1), member 1; SFTPA2, surfactant protein A2; SFTPB, 
surfactant protein B; SFTPD, surfactant protein D; SOD3, superoxide 
dismutase 3; STAT1, signal transducer and activator of transcription 1, 91 kDa; 
TIRAP, toll-interleukin 1 receptor (TIR) domain containing adaptor protein; 
TLR1, toll-like receptor 1; TNF, tumor necrosis factor; TRAF6, TNF 
receptor-associated factor 6; VEGFA, vascular endothelial growth factor A. 

Establishing the association of genes with ARDS susceptibility 
is only the beginning of a process that should continue with 
the discovery of the causal genetic variants. The challenge continues 
to be the validation of existing and novel ARDS associations 
via robust studies, and future and ongoing studies should amend 
the critical issues here recognized. In this effort, new technologies 
are allowing a faster field development by means of genome-wide 
studies, either using genotyping arrays or exome/whole genome 
sequencing. GWAS are as efficient as candidate-gene studies for 
detecting weak-effect risks and do not require a prior hypothesis 
about the biological processes related to the trait. They have 
allowed the identification of new disease genes never anticipated and have led 
to new hypotheses and perspectives about disease pathogenesis 
(Marchini and Howie, 2010). Despite that, GWAS have major 
limitations including high costs, which usually impact the sample 
size, the statistical burden and the gene coverage. In addition, 
most commercial platforms may offer less coverage for the gene(s) 
of interest compared to that achieved in optimal candidate- 
gene studies, which can substantially impact study power (Voight 
et al., 2012). The first GWAS of ARDS was recently published by 
Christie et al. (2012), revealing PPFIA1 as a novel susceptibility 
gene involved in cell adhesion and cell-matrix interactions, and 
suggesting many others with putative functional roles. This study 
also replicated the association of four candidate genes including 
IL10, MYLK, ANGPT2, and FAS. This may suggest that all other 
candidate gene associations should be considered false discover- 
ies. However, one explanation for this inconsistency could be also 
the insufficient GWAS coverage of the non-associated candidate 
genes (average ~57%; Table S2). Whatever the case, commercial 
platforms will only allow studying a fraction of the millions of 
existing genetic variants (Abecasis et al., 2012), and it is antici- 
pated that the associations to be revealed will only explain a small 
component of the disease (Manolio et al., 2009). Only complete 
re-sequencing of individual genomes will guarantee the analysis 
of all genetic variation. 

Here we have shown that the field still faces several method- 
ological challenges, and in the clinical arena there are key issues 
to be improved in order to fully understand the genetic processes 
underlying ARDS. Misclassification of phenotypes can 
lead to a significant reduction in statistical power to detect true 
genetic associations; a better and more homogeneous patient 
classification therefore becomes necessary. This could be achieved 
by combining the clinical information with different integra- 
tive approaches, those based on the determination of the causal 
microorganisms by means of metagenomics (Lysholm et al., 
2012) or performing gene expression profiling among patients 
(Hu et al., 2012), to name a few. As a proof of concept, in a recent 
study by O'Mahony et al. (2012), new associations were revealed 
and previous findings were replicated only when the samples were 
restricted to the more severe phenotype. Furthermore, 
quantitative phenotypes could be utilized for association testing, 
such as ventilator-free days (Kangelaris et al., 2012) or ideally 
other traits that are closer to the genotype. This possibility has 
been explored in the field with striking (Wurfel et al., 2008) and 
replicable results (Pino-Yanes et al., 2010). Additionally, the selec- 
tion of the control samples remains a challenge; it is not an easy 
task and not a single design is free of bias. The use of either healthy 
subjects or at-risk individuals is common among the reviewed 
studies. An alternative solution can be the utilization of both types 
of controls to reduce selection biases and to be able to confidently 
assess the quality of the genotypic data. This strategy has been 
used (Song et al., 2010), and will surely reduce the chances that 
the reported risk variants are causally associated with a confounder 
and not with ARDS. 

In summary, the methodology for assessing genetic risks in 
complex diseases is under development. For ARDS, we conclude 
that the main challenge continues to be in providing an analyti- 
cally rigorous methodology (adjusting for population stratifica- 
tion, relatedness, and technical quality) accompanied by inde- 
pendent replication and mechanistic explanations for the results 
provided. Still today, the evidence supporting the genetic associ- 
ations with ARDS susceptibility or outcomes is at best uncertain, 
given the limited statistical power of most studies and the effects 
expected for genetic variants involved in complex traits. To guarantee 
proper and high-quality studies on genetic susceptibility 
and outcomes, we strongly encourage the use of large and well-defined 
collections of samples. Consequently, a shift toward the 
establishment of international consortia will be necessary. 